HTTP hijack

While studying the standard library's net/rpc package today, I was reading the part about RPC over HTTP and noticed something interesting.

RPC over HTTP uses the HTTP protocol only to establish the connection; once the connection is up, HTTP is out of the picture. The term for this is hijack.

RPC OVER HTTP

func main() {
	// Create an Arith value (the Arith type is defined elsewhere).
	arith := new(Arith)
	// Register arith with the default RPC server.
	rpc.Register(arith)
	// Expose the RPC server over HTTP.
	rpc.HandleHTTP()

	// net/http takes care of listening on the port.
	err := http.ListenAndServe(":1234", nil)
	if err != nil {
		fmt.Println(err.Error())
	}
}
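To see the server side above working end to end, here is a minimal sketch that starts an RPC-over-HTTP server on a free local port and calls it with rpc.DialHTTP. The post does not show Arith, so the Args type and the Multiply method below are assumptions, and multiplyOverHTTP is a hypothetical helper. It also uses rpc.NewServer with its own ServeMux instead of rpc.HandleHTTP, only so that the helper can be called more than once without registering the same path twice.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"net/rpc"
)

// Args and Arith are assumptions; the post does not show their definitions.
type Args struct{ A, B int }

type Arith struct{}

// Multiply has the method shape net/rpc expects: (args, *reply) error.
func (*Arith) Multiply(args *Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

// multiplyOverHTTP (hypothetical helper) starts an RPC-over-HTTP server on a
// free local port, dials it, and performs a single Arith.Multiply call.
func multiplyOverHTTP(a, b int) (int, error) {
	server := rpc.NewServer()
	if err := server.Register(new(Arith)); err != nil {
		return 0, err
	}
	mux := http.NewServeMux()
	mux.Handle(rpc.DefaultRPCPath, server) // the path rpc.DialHTTP connects to

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go http.Serve(ln, mux)

	// DialHTTP sends the CONNECT request and then speaks RPC on the same conn.
	client, err := rpc.DialHTTP("tcp", ln.Addr().String())
	if err != nil {
		return 0, err
	}
	defer client.Close()

	var reply int
	err = client.Call("Arith.Multiply", &Args{A: a, B: b}, &reply)
	return reply, err
}

func main() {
	product, err := multiplyOverHTTP(6, 7)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(product) // prints 42
}
```

From the client's point of view HTTP never shows up after DialHTTP returns; every later Call travels over the raw connection.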

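rpc.HandleHTTP()

--->
Internally, this registers the RPC server, which implements the http.Handler interface and therefore handles requests, for rpc.DefaultRPCPath ("/_goRPC_") on the default ServeMux of the net/http package.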

The RPC server implements the http.Handler interface, which means it has a ServeHTTP method.
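One way to check this without opening a socket is to hand the RPC server a request directly, as a small sketch with net/http/httptest. The helper name plainGet is made up for illustration. Because the request is a GET rather than a CONNECT, the handler is expected to refuse it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/rpc"
)

// plainGet (hypothetical helper) sends an ordinary GET to an RPC server through
// its ServeHTTP method and returns the status code and body it answered with.
func plainGet() (int, string) {
	// This assignment compiles only because *rpc.Server has a ServeHTTP method.
	var handler http.Handler = rpc.NewServer()

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, rpc.DefaultRPCPath, nil)
	handler.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := plainGet()
	fmt.Printf("%d %q\n", code, body) // prints 405 "405 must CONNECT\n"
}
```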

// ServeHTTP implements an http.Handler that answers RPC requests.
func (server *Server) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	if req.Method != "CONNECT" {
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		w.WriteHeader(http.StatusMethodNotAllowed)
		io.WriteString(w, "405 must CONNECT\n")
		return
	}
	conn, _, err := w.(http.Hijacker).Hijack()
	if err != nil {
		log.Print("rpc hijacking ", req.RemoteAddr, ": ", err.Error())
		return
	}
	io.WriteString(conn, "HTTP/1.0 "+connected+"\n\n")
	server.ServeConn(conn)
}

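ServeHTTP holds the HTTP-side logic. It accepts only HTTP CONNECT requests (anything else gets a 405), hijacks the connection behind the request, and then hands that conn to ServeConn. So net/rpc uses HTTP CONNECT only to establish the connection, which is quite different from an ordinary RESTful API.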
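The client half of this handshake can be done by hand, which makes it clear how little HTTP is involved. The sketch below dials a TCP connection, writes a CONNECT request line, and reads back the single status line the server sends before switching to RPC. The helper names connectStatus and startRPCServer are made up for illustration, and the server has no services registered because only the handshake is of interest.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"net/rpc"
)

// startRPCServer (hypothetical helper) serves an empty RPC server over HTTP on
// a free local port and returns the listener.
func startRPCServer() (net.Listener, error) {
	mux := http.NewServeMux()
	mux.Handle(rpc.DefaultRPCPath, rpc.NewServer())
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go http.Serve(ln, mux)
	return ln, nil
}

// connectStatus (hypothetical helper) performs the CONNECT handshake manually
// and returns the status text of the server's reply.
func connectStatus(addr string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()

	request := "CONNECT " + rpc.DefaultRPCPath + " HTTP/1.0\n\n"
	if _, err := io.WriteString(conn, request); err != nil {
		return "", err
	}
	// Only the status line and the empty header block are parsed. The body is
	// deliberately left unread: from here on the conn would carry RPC traffic,
	// and a real client would pass it to rpc.NewClient.
	resp, err := http.ReadResponse(bufio.NewReader(conn), &http.Request{Method: "CONNECT"})
	if err != nil {
		return "", err
	}
	return resp.Status, nil
}

func main() {
	ln, err := startRPCServer()
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()

	status, err := connectStatus(ln.Addr().String())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(status) // prints 200 Connected to Go RPC
}
```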

type Hijacker interface {
	// Hijack lets the caller take over the connection.
	// After a call to Hijack the HTTP server library
	// will not do anything else with the connection.
	//
	// It becomes the caller's responsibility to manage
	// and close the connection.
	//
	// The returned net.Conn may have read or write deadlines
	// already set, depending on the configuration of the
	// Server. It is the caller's responsibility to set
	// or clear those deadlines as needed.
	//
	// The returned bufio.Reader may contain unprocessed buffered
	// data from the client.
	//
	// After a call to Hijack, the original Request.Body must not
	// be used. The original Request's Context remains valid and
	// is not canceled until the Request's ServeHTTP method
	// returns.
	Hijack() (net.Conn, *bufio.ReadWriter, error)
}

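Calling Hijack() takes the TCP connection underlying the HTTP request out of the server's hands. From that point on the HTTP server no longer manages the connection; the caller is responsible for it, including closing it.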

So how does the HTTP response differ once Hijack has been used?

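package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Register handler1 here instead to see the hijacked response.
	http.HandleFunc("/", handler2)
	_ = http.ListenAndServe(":8008", nil)
}

// handler1 hijacks the connection and writes raw bytes straight to it.
func handler1(writer http.ResponseWriter, request *http.Request) {
	hijacker, ok := writer.(http.Hijacker)
	if !ok {
		return // this ResponseWriter does not support hijacking
	}
	conn, buf, err := hijacker.Hijack()
	if err != nil {
		return
	}
	defer conn.Close()
	_, _ = buf.WriteString("hello world")
	_ = buf.Flush()
}

// handler2 is an ordinary handler; net/http writes the status line and headers.
func handler2(writer http.ResponseWriter, request *http.Request) {
	_, _ = fmt.Fprint(writer, "hello world")
}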

Response from handler1

$ curl -i "http://localhost:8008/"
hello world%

Response from handler2

$ curl -i "http://localhost:8008/"
HTTP/1.1 200 OK
Date: Fri, 24 Apr 2020 15:43:26 GMT
Content-Length: 11
Content-Type: text/plain; charset=utf-8

hello world%

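As you can see, after Hijack the response has no status line and no headers: only the bytes the handler itself wrote to the connection are sent.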
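If a hijacking handler still wants to answer in HTTP, it has to write the status line and headers itself. The following is a sketch under that assumption, not something from the post: rawHandler and fetchHijacked are hypothetical names, and the header set is just the minimum needed for a client to parse the reply.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// rawHandler hijacks the connection and writes a complete HTTP/1.1 response by
// hand, since net/http will no longer add the status line or headers.
func rawHandler(w http.ResponseWriter, r *http.Request) {
	hj, ok := w.(http.Hijacker)
	if !ok {
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	conn, buf, err := hj.Hijack()
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer conn.Close() // closing the conn is now the handler's job

	body := "hello world"
	fmt.Fprint(buf, "HTTP/1.1 200 OK\r\n")
	fmt.Fprint(buf, "Content-Type: text/plain; charset=utf-8\r\n")
	fmt.Fprintf(buf, "Content-Length: %d\r\n", len(body))
	fmt.Fprint(buf, "Connection: close\r\n\r\n")
	fmt.Fprint(buf, body)
	_ = buf.Flush()
}

// fetchHijacked (hypothetical helper) serves rawHandler from a test server and
// returns the status code and body an ordinary HTTP client sees.
func fetchHijacked() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(rawHandler))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(data), err
}

func main() {
	code, body, err := fetchHijacked()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(code, body) // prints 200 hello world
}
```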

func (c *conn) serve(ctx context.Context) {
	...
	serverHandler{c.server}.ServeHTTP(w, w.req)
	w.cancelCtx()
	if c.hijacked() {
		return
	}
	w.finishRequest()
	...
}

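This is a method in the net/http package and the core of how each connection is served. It calls ServeHTTP, which ends up in the handler functions above. If the connection was hijacked, serve simply returns; an ordinary HTTP request instead continues to finishRequest, which finishes the response, headers included.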
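The same hand-off is visible from inside a handler: once the connection has been hijacked, the ResponseWriter refuses further writes with http.ErrHijacked. The sketch below demonstrates this with a test server; writeAfterHijack is a hypothetical helper, and the server also logs a line about the attempted write.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeAfterHijack (hypothetical helper) serves one request with a handler that
// hijacks the connection and then attempts an ordinary ResponseWriter write.
// It returns the error produced by that write.
func writeAfterHijack() error {
	result := make(chan error, 1)
	handler := func(w http.ResponseWriter, r *http.Request) {
		hj, ok := w.(http.Hijacker)
		if !ok {
			result <- errors.New("hijacking not supported")
			return
		}
		conn, _, err := hj.Hijack()
		if err != nil {
			result <- err
			return
		}
		defer conn.Close()
		_, err = w.Write([]byte("too late"))
		result <- err
	}
	srv := httptest.NewServer(http.HandlerFunc(handler))
	defer srv.Close()

	// The connection is closed without any response, so the client call is
	// expected to fail; only the handler-side error matters here.
	if resp, err := http.Get(srv.URL); err == nil {
		resp.Body.Close()
	}
	return <-result
}

func main() {
	err := writeAfterHijack()
	fmt.Println(errors.Is(err, http.ErrHijacked)) // prints true
}
```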

In Go, Hijack is typically used in two scenarios:

  1. RPC over HTTP
  2. Upgrading from HTTP to WebSocket
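For the second scenario, the server-side opening handshake can be sketched with Hijack as follows. This is only an illustration of where Hijack fits, not a WebSocket implementation: frame reading and writing are omitted, the checks on the request are minimal, and acceptKey and upgradeHandler are hypothetical names. The accept value is computed as described in RFC 6455, and main checks it against the sample key from that RFC.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

// websocketGUID is the fixed GUID that RFC 6455 appends to the client key.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey derives the Sec-WebSocket-Accept value from Sec-WebSocket-Key.
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

// upgradeHandler answers a WebSocket upgrade request by hijacking the
// connection and writing the 101 response itself. Frame handling is omitted.
func upgradeHandler(w http.ResponseWriter, r *http.Request) {
	key := r.Header.Get("Sec-WebSocket-Key")
	if !strings.EqualFold(r.Header.Get("Upgrade"), "websocket") || key == "" {
		http.Error(w, "expected a WebSocket upgrade", http.StatusBadRequest)
		return
	}
	hj, ok := w.(http.Hijacker)
	if !ok {
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	conn, buf, err := hj.Hijack()
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer conn.Close()

	fmt.Fprint(buf, "HTTP/1.1 101 Switching Protocols\r\n")
	fmt.Fprint(buf, "Upgrade: websocket\r\n")
	fmt.Fprint(buf, "Connection: Upgrade\r\n")
	fmt.Fprintf(buf, "Sec-WebSocket-Accept: %s\r\n\r\n", acceptKey(key))
	_ = buf.Flush()
	// A real server would now read and write WebSocket frames on conn.
}

func main() {
	// Sample key and expected result taken from RFC 6455.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
	// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}
```

The handler can be mounted like any other, for example with http.HandleFunc on a path of your choosing. In practice a maintained WebSocket library is the better choice; the point here is only that the protocol switch happens on a hijacked connection.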